Is PRISM "RISC"?



• RISC = Reduced Instruction Set 
Computer 

— IBM 801 

— Berkeley 

— Titan, Safe, Cascade... 

• Small Instruction Set 

• Instructions in Hardware 

• "RISC" compared to VAX? 
But...

PRISM isn't RISC! 

PRISM is: 



• Parallel Architecture 



• Vectors 



• Some "RISC" Concepts 



• Some "CISC" Concepts 



• Designed for PERFORMANCE 
PRISM ARCHITECTURAL
GOALS 



• High (Absolute) Performance 



• 2:1 (or better) Cost/Performance 



• VAX Compatibility



• VAX Extension Architecture for 1990's 
Why Not Build
A Faster VAX? 



• Complex (microcoded) instruction set 

• Variable length instructions 

• Many addressing modes (good & bad) 

• 512 byte page size 

• Autoincrement & Autodecrement 

• Condition codes force synchronization 

• Lots of unused functionality 
PRISM
ARCHITECTURE OVERVIEW 



32-bit architecture 



32-bit virtual address space 



45-bit physical address space 



VAX compatible memory addressing 



VAX compatible data types 



Scalar and vector processing 
Symmetric multiprocessing 
SCALAR PROCESSING



• 64 32-bit registers 

• 8-, 16-, 32-bit integers/logicals 

• 32- and 64-bit F and G Floating 

• Parallel instruction execution 



• Comprehensive, yet simple, instruction 
set 



• Load/Store memory referencing 
VECTOR PROCESSING



16 vector registers 

64 elements per vector register 

64-bits per vector element 

32-bit integer/logicals 

32- and 64-bit F and G Floating

Single instruction processes entire 
vector register 

Similar to Cray-2 vector functionality 
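
The register-file dimensions on this slide can be made concrete with a short sketch. The Python below is illustrative only: the names `VectorRegisterFile` and `vadd` are hypothetical and are not PRISM mnemonics, and element values are modeled as plain Python numbers rather than 64-bit fields.

```python
NUM_VREGS = 16        # vector registers (from the slide)
ELEMS_PER_VREG = 64   # elements per vector register (from the slide)
BITS_PER_ELEM = 64    # bits per vector element (from the slide)


class VectorRegisterFile:
    """Toy model of a vector register file; not the PRISM hardware design."""

    def __init__(self):
        self.regs = [[0] * ELEMS_PER_VREG for _ in range(NUM_VREGS)]

    def vadd(self, dst, src_a, src_b):
        """One 'instruction' adds every element pair of two registers."""
        self.regs[dst] = [a + b for a, b in zip(self.regs[src_a], self.regs[src_b])]


def register_file_bytes():
    """Total storage implied by the three numbers on the slide."""
    return NUM_VREGS * ELEMS_PER_VREG * BITS_PER_ELEM // 8


vrf = VectorRegisterFile()
vrf.regs[1] = list(range(ELEMS_PER_VREG))
vrf.regs[2] = [10] * ELEMS_PER_VREG
vrf.vadd(0, 1, 2)  # a single call processes all 64 elements
```

With these numbers the vector register file holds 16 x 64 x 8 bytes, which is 8192 bytes in total.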
MEMORY MANAGEMENT



• 32-bit virtual address space 

• Basis for relocation, protection and 
paging 

• Execute protection for proprietary code 

• 8KB page size for added TB efficiency 
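
The TB-efficiency point can be illustrated with simple arithmetic comparing the 8KB PRISM page to the 512-byte VAX page listed on the "Why Not Build A Faster VAX?" slide. This is a sketch under stated assumptions: the TB entry count used here is hypothetical and does not come from the deck.

```python
ADDRESS_BITS = 32             # virtual address size (from the slide)
PRISM_PAGE_BYTES = 8 * 1024   # 8KB page (from the slide)
VAX_PAGE_BYTES = 512          # VAX page size (from the "Faster VAX" slide)
TB_ENTRIES = 64               # ASSUMPTION: hypothetical TB size, not from the deck


def pages_in_address_space(page_bytes, address_bits=ADDRESS_BITS):
    """Number of pages needed to cover the whole virtual address space."""
    return (1 << address_bits) // page_bytes


def tb_reach_bytes(page_bytes, entries=TB_ENTRIES):
    """Memory that a fully populated TB can map without a miss."""
    return page_bytes * entries


reach_ratio = tb_reach_bytes(PRISM_PAGE_BYTES) // tb_reach_bytes(VAX_PAGE_BYTES)
```

Whatever the real TB size, each entry maps 16 times as much memory with 8KB pages as with 512-byte pages, so the same number of entries covers 16 times the address range.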
PRISM ADVANTAGES



• Fixed length instructions 

• Lots of registers 

• Parallel execution; out-of-order 
completion 

• No (synchronous) condition codes 

• No compound instructions 

• No microcode required 

• Large pages 
CRYSTAL
HARDWARE SUMMARY 
Crystal Processor



Basics: 

• PRISM Architecture 

• Air-Cooled, ECL Multiprocessor (1-4) 

• Scalar with vector option(s) 

Performance: 

• 3X equivalent VAX per scalar unit 

• 100+ MFLOPS (Peak) per vector option 

• Memory Size = 128 to 512 MBytes 

• I/O Bandwidth > 50 MBytes/sec 

Transfer Costs: 

               Proc   Memory   MLP      TC      Markup
Entry Kernel   2      128 MB   $1465K   $169K   8.7X
Max Kernel     4      512 MB   $3733K   $451K   8.3X
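
The markup column appears to be the ratio of MLP to TC. The following sketch checks that reading against the two rows; the function name `markup` is illustrative, and the interpretation of the column as a simple ratio is an assumption that the slide does not state.

```python
def markup(mlp_thousands, tc_thousands):
    """Ratio of MLP to TC, both in thousands of dollars (assumed definition)."""
    return mlp_thousands / tc_thousands


entry_kernel = markup(1465, 169)  # Entry Kernel row
max_kernel = markup(3733, 451)    # Max Kernel row

if __name__ == "__main__":
    print(f"Entry Kernel markup: {entry_kernel:.1f}X")
    print(f"Max Kernel markup:   {max_kernel:.1f}X")
```

Rounded to one decimal place, both ratios agree with the Markup values in the table.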
Crystal Scalar Unit



Performance through both fast clock cycle 
and parallelism 



• High speed ECL gate arrays - 15nS cycle 



• Retire an instruction each cycle 



• Four independent function units 



• Fully pipelined multiply and add 



• Separate instruction and data caches 



• Data cache is writeback 
Crystal Vector Option



Architecture and fast clock cycle provide 
very high throughput 



• Sixteen multiported vector registers 

• Four autonomous function units 

• Fully pipelined multiply and add 

• 132 Mflops peak performance 

• 2 board addition to Scalar processor 
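
A rough plausibility check relates the peak figure to the 15nS cycle quoted on the Scalar Unit slide. The sketch assumes that peak throughput means one multiply result plus one add result per cycle; the deck does not say how the 132 Mflops figure was derived, so treat this as an estimate rather than the designers' calculation.

```python
CYCLE_NS = 15.0       # cycle time (from the Scalar Unit slide)
FLOPS_PER_CYCLE = 2   # ASSUMPTION: one multiply + one add completed per cycle


def peak_mflops(cycle_ns, flops_per_cycle=FLOPS_PER_CYCLE):
    """Peak Mflops for a given cycle time under the assumption above."""
    cycles_per_second = 1e9 / cycle_ns
    return cycles_per_second * flops_per_cycle / 1e6


def implied_cycle_ns(mflops, flops_per_cycle=FLOPS_PER_CYCLE):
    """Cycle time that would yield the given peak under the same assumption."""
    return flops_per_cycle * 1e3 / mflops


estimate = peak_mflops(CYCLE_NS)   # about 133 Mflops
implied = implied_cycle_ns(132)    # slightly above 15 ns
```

Under this assumption a 15 ns cycle gives roughly 133 Mflops, close to the 132 Mflops on the slide; the small gap is consistent with a cycle time marginally longer than 15 ns.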
Crystal Memory System



Parallelism used to achieve very high 
performance 



• 1-2+ Gigabytes/Second 



• MultiLevel cache hierarchy 

• Main memory is 32 way interleaved 



• Memory size 128 to 512 MBytes 
Crystal I/O System



Multiple Channels and Independent 
Processors used to achieve high bandwidth 



• VAXBI allows standard DEC devices 



• I/O Processors off-load main CPU 



• VAX as IOP allows BCA 



• Memory on IOP maximizes BI bandwidth



• Eight VAXBIs provide 64Mbytes/sec 
PRISM/VMS
Software Summary 
PRISM/VMS GOALS



• Quality 

• Robustness, Extendability, and 
Maintainability 

• New Functionality 

• VMS Compatibility 

• Schedule 

• Performance 
SOFTWARE SUMMARY



• Very similar to VAX/VMS 

• ULTRIX 

• Cluster Support 

• Symmetrical MP 

• Vector Support 

• Multitasking 
SOFTWARE SUMMARY - CONT.



• PILLAR Systems Implementation 
Language 

• Layered Languages: 

- Vectorizing FORTRAN 

- BLISS 

- Pascal 

- C 
VMS COMPATIBILITY



At user interfaces and at VAX/VMS 
interfaces: 



System services via compatibility layer 

Disk and Magtape structures 

DCL and utilities 

DECnet and remote terminal support 

Clusters 

Languages, RTL, and debugger 

Layered Products 
PRISM COMPETITION
SUMMARY 
Competition



IBM 



• Technology Leaders 



Others 
High Technology Companies



• Convex (C-1) 



• Alliant (FX/1 and FX/8) 



• Scientific Computer Systems (SCS) 

• Elxsi (System 6400) 

• Floating Point Systems (FPS-164 and 
FPS-264) 



• Market Share for These & Other High 
Technology Companies 

. 4% FY85 ($625K-1.6M Priceband)

. 2% FY85 ($1.6M+ Priceband)
Product Comparison Summary

               Crystal        IBM           Convex   Alliant       Amdahl        Cray
Model          1-4            3090          C-1      FX/8          5890          2
# CPUs                        200 & 400                            300 & 600
# Proc         1-4            2, 4          5        1-8           2, 4          4
VUPs           30-100         21-38         10       3.5-26        30-54         120*
Cost per VUP   $20K           $183K-$188K   $52K     $39K          $170K-$172K   $147K*
MFLOPS         100+ - 200+    100-200       60       94                          1600
System Price   $571K-$2.15M   $3.9M-$7.2M   $515K    $270K-$1.0M   $5.1M-$9.3M   $17.6M
VP             Yes            Yes**         Yes      Yes           No            Yes

* Estimated 

** Attached Vector Processor 

Please note: Crystal systems (which FRS in FY89) 
are compared to currently shipping competitive 
systems. 
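
The cost-per-VUP row can be cross-checked by dividing system price by VUPs. The sketch below does that for the figures in the comparison table. One assumption is labeled in the code: where the table gives ranges, the low price is paired with the low VUP figure and the high price with the high VUP figure, which the slide does not confirm.

```python
# (system price in $K, VUPs), copied from the comparison table.
# ASSUMPTION: for ranges, low price pairs with low VUPs and high with high.
SYSTEMS = {
    "Crystal (entry)": (571, 30),
    "Crystal (max)": (2150, 100),
    "IBM 3090 (low)": (3900, 21),
    "IBM 3090 (high)": (7200, 38),
    "Convex C-1": (515, 10),
    "Alliant FX/8 (max)": (1000, 26),
    "Amdahl 5890 (low)": (5100, 30),
    "Amdahl 5890 (high)": (9300, 54),
    "Cray-2": (17600, 120),
}


def cost_per_vup_k(price_k, vups):
    """System price divided by VUPs, in thousands of dollars."""
    return price_k / vups


COSTS = {name: cost_per_vup_k(p, v) for name, (p, v) in SYSTEMS.items()}

if __name__ == "__main__":
    for name, cost in COSTS.items():
        print(f"{name:20s} ${cost:6.1f}K per VUP")
```

Under this pairing the computed values land on or close to the table's cost-per-VUP entries; the single Alliant figure corresponds to the largest configuration.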
Alliant FX/8



• Price/Performance 

. 3.5-26 VUPs - $39K per VUP (1-8 
Processors) 

. 94 MFLOPS 

. $270K-1.0M Systems 

• 1-8 Processors 

• Vector Processing 

• Compiler Technology (Decomposing, 
Vectorizing) 



• Full Fortran Support for Vector Hardware 
Parallelism 
Crystal and Alliant Product Comparisons



                   Crystal    Alliant
# of Processors    1-4        1-8*
Processor Bits     32         32/64
Cycle Time         15ns       170ns
Cache              64-256K    64-128K
System Memory      64-512MB   8-64MB
Memory Speed       800MB      188MB
I/O Architecture   IOP        IOP
Vector Processor   Yes        Yes

* 1-8 Computational and 12 Interactive Processors
Convex C-1



• Price/Performance 



10 VUPs - $52K per VUP



60 MFLOPS 



Entry System $515K



• 5 Processors 



• Vector Processing 
Crystal and Convex Product Comparisons



                   Crystal    Convex
# of Processors    1-4
Processor Bits     32         32/64
Cycle Time         15ns       50ns
Cache              64-256K    64K
System Memory      64-512MB   4-128MB
Memory Speed       800MB      80MB
I/O Architecture   IOP        IOP
I/O Bandwidth      64-100MB   80MB
Vector Processor   Yes        Yes

With True Parallelism
Cray-1 and 2



• Cray-1 



Entry System $8.8M 
250 MFLOPS 
64 Processor Bits 
12.5ns Cycle Time 
24 Channels 



• Cray-2 



4 Processors 
Entry System $17.6M 
1600 MFLOPS 
64 Processor Bits 
4.1ns Cycle Time 
40 Channels 
Crystal and IBM Product Comparisons



                   Crystal    IBM 3090
# of Processors    1-4        2,4
Processor Bits     32         32
Cycle Time         15ns       18.5ns
Cache              64-256K    64-256K
System Memory      64-512MB   64-256MB
I/O Architecture   IOP        Channels
I/O Bandwidth      64-100MB   96-288MB
Vector Processor   Yes        Yes
MAKING A SUCCESS
OF PRISM 
PRODUCT POSITIONING



• MARKET 

— High performance 

— Scientific computation 

— Engineering 

— Research 

• With AQUARIUS 

— Crystal = high end scientific, 
computational 

— Aquarius = high end commercial, MIS 

• With ARGONAUT 

— Dual processor entry above Argonaut 

— Argonaut = mid-range VAX processor 
CONCERNS



• Software schedules tight 



• Too much (?) to do now 



• FY89 will be here tomorrow 



• Software will be gating item 
HOW DO WE SUCCEED? - CONT.



• Deliver minimal, but key layered 
software: 



- Best Vectorizing FORTRAN (period) 

- Supporting tools (LSE, MMS, CMS, 
PCS, etc.) 

- Sell performance - use the iron! 
HOW DO WE SUCCEED?



• Keep our focus narrow: 



— Scientific computing 



- Member of VAX family 



— Top Fortune companies 
HOW DO WE SUCCEED? - CONT



• Catch up with VAX over several years 



— Schedule layered product 
introductions 



— Complementary offerings (product 
families) 